{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# Fuzzy cognitive diagnosis framework (FuzzyCDF)\n",
    "\n",
    "This notebook will show you how to train and use the FuzzyCDF.\n",
    "First, we will show how to get the data (here we use Math1 from math2015 as the dataset).\n",
    "Then we will show how to train a FuzzyCDF and perform the parameters persistence.\n",
    "At last, we will show how to load the parameters from the file and evaluate on the test dataset.\n",
    "\n",
    "The script version could be found in [FuzzyCDF.py](FuzzyCDF.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Preparation\n",
    "\n",
    "Before we process the data, we need to first acquire the dataset which is shown in [prepare_dataset.ipynb](prepare_dataset.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the data from files\n",
    "import numpy as np\n",
    "import json\n",
    "\n",
    "# type of problems\n",
    "obj_prob_index = np.loadtxt(\"../../data/math2015/Math1/obj_prob_index.csv\", delimiter=',', dtype=int)\n",
    "sub_prob_index = np.loadtxt(\"../../data/math2015/Math1/sub_prob_index.csv\", delimiter=',', dtype=int)\n",
    "# Q matrix\n",
    "q_m = np.loadtxt(\"../../data/math2015/Math1/q_m.csv\", dtype=int, delimiter=',')\n",
    "prob_num, know_num = q_m.shape[0], q_m.shape[1]\n",
    "\n",
    "# training data\n",
    "with open(\"../../data/math2015/Math1/train_data.json\", encoding='utf-8') as file:\n",
    "    train_set = json.load(file)\n",
    "stu_num = max([x['user_id'] for x in train_set]) + 1\n",
    "R = -1 * np.ones(shape=(stu_num, prob_num))\n",
    "for log in train_set:\n",
    "    R[log['user_id'], log['item_id']] = log['score']\n",
    "\n",
    "# testing data\n",
    "with open(\"../../data/math2015/Math1/test_data.json\", encoding='utf-8') as file:\n",
    "    test_set = json.load(file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'user_id': 0, 'item_id': 7, 'score': 1.0} {'user_id': 0, 'item_id': 9, 'score': 1.0}\n"
     ]
    }
   ],
   "source": [
    "print(train_set[0], test_set[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(67344, 16836)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(train_set), len(test_set)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Training and Persistence"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import logging\n",
    "logging.getLogger().setLevel(logging.INFO)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:save parameters to fuzzycdf.params\n"
     ]
    }
   ],
   "source": [
    "from EduCDM import FuzzyCDF\n",
    "\n",
    "cdm = FuzzyCDF(R, q_m, stu_num, prob_num, know_num, obj_prob_index, sub_prob_index, skip_value=-1)\n",
    "\n",
    "cdm.train(epoch=10, burnin=5)\n",
    "cdm.save(\"fuzzycdf.params\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Loading and Testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:load parameters from fuzzycdf.params\n",
      "evaluating: 100%|█████████████████████████████████████████████████████████████| 16836/16836 [00:00<00:00, 91552.55it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RMSE: 0.447697, MAE: 0.405684\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "cdm.load(\"fuzzycdf.params\")\n",
    "rmse, mae = cdm.eval(test_set)\n",
    "print(\"RMSE: %.6f, MAE: %.6f\" % (rmse, mae))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Incremental Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_data = [{'user_id': 0, 'item_id': 2, 'score': 0.0}, {'user_id': 1, 'item_id': 1, 'score': 1.0}]\n",
    "cdm.inc_train(new_data, epoch=10, burnin=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
